Providing Cross-Lingual Information Access with Knowledge-Poor Methods

Authors

  • Ralf Steinberger
  • Bruno Pouliquen
  • Camelia Ignat
Abstract

We propose a simple but efficient approach for a number of multilingual and cross-lingual language technology applications that are not limited to the usual two or three languages but can be applied with relatively little effort to larger sets of languages. The approach uses existing multilingual linguistic resources such as thesauri, nomenclatures and gazetteers, and exploits additional, more or less language-independent text items such as dates, currency expressions, numbers, names and cognates. Mapping texts onto the multilingual resources and identifying word token links between texts in different languages are basic ingredients for applications such as cross-lingual document similarity calculation, multilingual clustering and categorisation, cross-lingual document retrieval, and tools to provide cross-lingual information access.
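The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the thesaurus entries, descriptor IDs, function names and the choice of cosine similarity are all assumptions made for illustration. Each text is reduced to a language-independent profile (thesaurus descriptors plus numbers), and two texts in different languages are compared through those profiles.

```python
import math
import re
from collections import Counter

# Hypothetical toy thesaurus (an assumption for this sketch): surface terms
# in several languages mapped to language-independent descriptor IDs.
TOY_THESAURUS = {
    "fisheries": "D1", "pêche": "D1", "fischerei": "D1",
    "agriculture": "D2", "landwirtschaft": "D2",
    "trade": "D3", "commerce": "D3", "handel": "D3",
}

TOKEN_RE = re.compile(r"\w+")


def language_independent_profile(text):
    """Count thesaurus descriptors and numbers found in a text."""
    profile = Counter()
    for token in TOKEN_RE.findall(text.lower()):
        if token.isdigit():
            # Numbers are treated as (more or less) language-independent.
            profile["NUM:" + token] += 1
        elif token in TOY_THESAURUS:
            # Map the word onto its multilingual descriptor.
            profile["DESC:" + TOY_THESAURUS[token]] += 1
    return profile


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count profiles."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


english = language_independent_profile("Fisheries trade rose in 2004")
french = language_independent_profile("Le commerce de la pêche a augmenté en 2004")
german = language_independent_profile("Die Landwirtschaft im Jahr 1999")

print(cosine_similarity(english, french))
print(cosine_similarity(english, german))
```

In this toy setting the English and French sentences share every descriptor and number and so score high, while the German sentence shares none with the English one. A real system would need a full multilingual thesaurus, proper tokenisation, and handling for dates, names and cognates, none of which this sketch attempts.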

Similar resources

Evaluation of Dental Students’ Access, Knowledge, and Usage Regarding Information Technology in Dentistry

Abstract: Background and Aims: Information technology (IT) can make a powerful contribution to dental education and practice. The aim of the present study was to determine access, knowledge and usage of IT among dental students of Islamic Azad University of Esfahan in 2015. Materials and Methods: We conducted a cross-sectional study using a stratified random sampling method in 2016. A validat...

Exploiting Knowledge Bases for Multilingual and Cross-lingual Semantic Annotation and Search

The amount of entities in large knowledge bases (KBs) has been increasing rapidly, making it possible to propose new ways of intelligent information access. In addition, there is an impending need for systems that can enable multilingual and cross-lingual information access. In this work, we firstly demonstrate X-LiSA, an infrastructure for multilingual and cross-lingual semantic annotation, wh...

Final Report for the IPSC Exploratory Research Project Cross-lingual Indexing (4/2001 – 3/2003)

Cross-lingual information access: providing content descriptors in one language for texts written in another, by assigning Eurovoc thesaurus descriptors automatically.

CLTC: A Chinese-English Cross-lingual Topic Corpus

Cross-lingual topic detection within text is a feasible solution to resolving the language barrier in accessing the information. This paper presents a Chinese-English cross-lingual topic corpus (CLTC), in which 90,000 Chinese articles and 90,000 English articles are organized within 150 topics. Compared with TDT corpora, CLTC has three advantages. First, CLTC is bigger in size. This makes it po...

X-LiSA: Cross-lingual Semantic Annotation

The ever-increasing quantities of structured knowledge on the Web and the impending need of multilinguality and cross-linguality for information access pose new challenges but at the same time open up new opportunities for knowledge extraction research. In this regard, cross-lingual semantic annotation has emerged as a topic of major interest and it is essential to build tools that can link wor...

Journal title:
  • Informatica (Slovenia)

Volume 28

Publication date: 2004